Estimation of Pulse Availability

Gary  A.  Pryor 

U.S.  Army  Training  and  Doctrine  Command,  Fort  Leonard  Wood,  Missouri 

Operational availability, Ao, is an important consideration during the evaluation of system
effectiveness and sustainability. Ao is commonly used and widely understood as a measure of
steady-state system availability. However, a related term, “pulse availability,” Ap, has recently
been used more prominently in statements of requirements for some military systems under
development. Ap is a subset of Ao and applies to a shorter and usually more intensive usage
period, known as a “pulse.” Various techniques have been published on the estimation, calculation,
and measurement of Ao; however, little if anything can be found on the calculation and estimation
of pulse availability, Ap. This article presents a straightforward approach for estimating Ap,
which will aid in the specification and evaluation of Ap requirements.


Key words: Availability, downtime, uptime, operational time, maintenance time, failure,
time to restore.

Operational availability, Ao, is widely used as a readiness-related objective in the
specification of requirements for military systems. In general terms, Ao is the proportion
of time a system is either operating or is capable of operating (called “uptime”), while
being used in a specific manner in a typical maintenance and supply environment. In other
words, Ao is the ratio of “uptime” to “total time.” For complete definitions and discussion
of Ao, see Pryor (2008).

Pulse availability, Ap, is a subset of operational availability that applies when the period
of interest is not the steady-state availability, but system availability during a short,
usually intensive period of operations.

Figure 1 shows a typical “failure-restore” cycle, which can be used to model steady-state
operational availability. This cycle is theoretically repeated continuously because a system
operates, experiences a failure, and accumulates downtime associated with maintenance and
logistics delays until it is restored to an operational state.

Availability must be looked at differently when the period of interest is not steady state,
but a shorter period, such as a period of intensive usage. Pulse availability, Ap, is usually
higher than the steady-state Ao for the same operating tempo. When a failure occurs during
the pulse, some of the downtime associated with the restoration of the system may extend
outside of the pulse and is not counted as downtime for the pulse. One such possible example
is illustrated in Figure 2. The amount of any increase in Ap over Ao depends primarily on the
length of the pulse, the expected downtime per failure, and the steady-state Ao.

Estimation  of  operational  availability 

Two related Ao equations were described in previously published work by the author
(Pryor, 2008) and are shown here as Equations 1 and 2. The specific equation or methodology
used to determine Ao is of no importance, but Equations 1 and 2 are shown as one
possible source.

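$$A_o = \frac{1 - \mathrm{OPR} \times \mathrm{CMR}_{ess}}{1 + \mathrm{OPR}\,(\mathrm{MCMT} + \mathrm{ALDT})/\mathrm{MTBF}}, \tag{1}$$

$$A_o = \frac{1}{1 + (\mathrm{MCMT} + \mathrm{ALDT})/\mathrm{MTBF} + \mathrm{CMR}_{ess}}, \tag{2}$$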

where

ALDT = administrative and logistics delay time (per failure) in hours.

CMRess = clock hour maintenance ratio, essential (maintenance time not associated with
a critical failure that causes downtime; expressed as maintenance clock hours
per operating hour). Includes both essential scheduled (preventive) maintenance
and unscheduled (corrective) maintenance.

MCMT = mean corrective maintenance time (per failure) in hours.

MTBF = mean time between failure (representing “critical” failures that cause the
system to be nonoperational) in hours.

OPR = operating rate (ratio of operating time to total time; e.g., a system that operates
12 hours per day would have an OPR of 50 percent).

Figure 1. Steady-state failure-restore cycle.

If  CMRess  -  [(1  -  OPR)/OPR]  <  0,  then 
Equation  1  can  be  used;  otherwise  use  Equation  2. 
(Equation  2  is  used  for  continuously  operating  or  high 
op-tempo  systems.) 
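
The selection rule and the two equations can be sketched in Python as follows. This is a minimal illustration, not code from the article: the function and argument names are hypothetical, and the example call assumes CMRess = 0, a value the article does not state for its worked cases.

```python
def operational_availability(mtbf, mcmt, aldt, opr, cmr_ess=0.0):
    """Steady-state Ao from Equation 1 or Equation 2.

    mtbf, mcmt, aldt are in hours; opr is operating time / total time (0 < opr <= 1);
    cmr_ess is essential maintenance clock hours per operating hour
    (defaulting it to 0 is an assumption made for this sketch).
    """
    restore_ratio = (mcmt + aldt) / mtbf  # restore hours per operating hour
    if cmr_ess - (1.0 - opr) / opr < 0:
        # Equation 1
        return (1.0 - opr * cmr_ess) / (1.0 + opr * restore_ratio)
    # Equation 2: continuously operating or high op-tempo systems
    return 1.0 / (1.0 + restore_ratio + cmr_ess)


if __name__ == "__main__":
    ao = operational_availability(mtbf=100, mcmt=2, aldt=80, opr=10 / 24)
    print(f"Ao = {ao:.3f}")
```

With MTBF = 100 hours, OPR = 10/24, ALDT = 80 hours, and MCMT = 2 hours, this evaluates to about 0.745.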

Estimation  of  pulse  availability 

Ap will always be greater than or equal to the steady-state Ao because some of the downtime
that was induced during the pulse will extend outside of the pulse, and therefore is not
counted against the pulse availability. The difference between Ap and steady-state Ao varies
depending on the reliability relative to the pulse; the relative duration of the downtime;
and, of course, the planned usage during the pulse.

If the MTBF is significantly greater than the length of the pulse, there is a high chance of
completing the pulse without a failure. Then, a failure will occur only during a small
percentage of pulses, and only that small percentage of pulses will experience any downtime
during the pulse. For this case there will not be much difference between the Ap and the
steady-state Ao.

If the MTBF is such that there is a good chance of experiencing one or more failures during
the pulse, and if the average downtime is also high (relative to the pulse), then a
significant portion of downtime can be expected to extend beyond the pulse, and the Ap will
differ significantly from the steady-state Ao.

As shown in Figure 3, in situations with very long pulses, or low MTBF combined with low
downtimes, many failure-restore cycles occur during the pulse. In that case, because there
are multiple failures and repairs during the pulse, most of the associated downtime occurs
during the pulse and only downtime from the last failure can extend beyond the pulse. There
will then be some, but perhaps not a significant, difference between the Ap and
steady-state Ao.

We begin by modifying the normal Ao equation to include only the uptime and downtime that
occur during the pulse, as shown in Equation 3.

$$A_p = \frac{\mathrm{Uptime}_p}{\mathrm{Uptime}_p + \mathrm{Downtime}_p}, \tag{3}$$

where Uptimep (UTp) is the uptime within the pulse; and Downtimep (DTp) is the downtime
within the pulse. As we have discussed previously, the difference between Ao and Ap lies in
the fact that some amount of the expected downtime extends beyond the pulse. The downtime
that extends beyond the pulse is designated as DTop.

In general, the proportion of time we can expect to be operational is Ao, so the amount of
time we can expect to be operational during a pulse of length PT (the pulse time, in hours)
is UTp = Ao × PT. Similarly, the amount of downtime we can expect during a pulse is
DTp = (1 − Ao) × PT, less some amount of downtime normally expected that falls outside
the pulse (DTop).

If a system is in a failed state at the end of a pulse, we will assume that there is an equal
chance of being at any point in the restore process. So, given that the system is in a failed
state, the average amount of downtime extending beyond the end of the pulse is one-half of
the restore time, or (MCMT + ALDT)/2. By definition, 1 − Ao is the probability that the
system is nonoperational at any given time, including the end of the pulse. So, there is a
1 − Ao probability that an (MCMT + ALDT)/2 amount of downtime will fall outside the pulse.
This means that

$$\mathrm{DT}_p = \text{estimated downtime for pulse} - \text{expected downtime outside of pulse} = [(1 - A_o) \times \mathrm{PT}] - [(1 - A_o)(\mathrm{MCMT} + \mathrm{ALDT})/2].$$
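
As a worked illustration of the expressions for UTp, DTop, and DTp, the short sketch below evaluates them for one set of hypothetical inputs (Ao = 0.75, MCMT = 2 hours, ALDT = 80 hours, PT = 720 hours). The function name is an assumption made for illustration, and no limit is placed on the restore time at this stage.

```python
def expected_pulse_times(ao, mcmt, aldt, pt):
    """Return (UTp, DTp, DTop) in hours for a pulse of length pt.

    Applies the half-restore-time assumption from the text.
    """
    ut_p = ao * pt                            # expected uptime inside the pulse
    dt_op = (1.0 - ao) * (mcmt + aldt) / 2.0  # expected downtime falling past the pulse end
    dt_p = (1.0 - ao) * pt - dt_op            # expected downtime remaining inside the pulse
    return ut_p, dt_p, dt_op


if __name__ == "__main__":
    ut_p, dt_p, dt_op = expected_pulse_times(ao=0.75, mcmt=2, aldt=80, pt=720)
    print(f"UTp = {ut_p:.2f} h, DTp = {dt_p:.2f} h, DTop = {dt_op:.2f} h")
```

For these inputs UTp = 540 hours, DTop = 10.25 hours, and DTp = 169.75 hours, so only a small share of the expected 180 hours of downtime is pushed outside a pulse this long.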

There is, however, one limitation arising from our assumption with respect to the average
amount of downtime extending beyond the pulse. If the time to restore (TTR) is large with
respect to the pulse time and we end the pulse in a failed state, then the system is much
more likely to be early in the restore process and our assumption that one-half of the
restore time extends beyond the pulse is not valid. Take for example a pulse time of
72 hours and a TTR of 200 hours. Using the previous assumption, we would effectively be
reducing the pulse downtime by one-half of 200 hours, or 100 hours, which is longer than the
actual pulse itself. To overcome this anomaly, we limit the time to restore to no more than
the pulse time itself. Simulation results have borne out that no matter how high the restore
time, once it exceeds the pulse time the resultant Ap is not affected. In that case, once a
failure occurs, the failed system will not be returned to operation during the pulse. Now we
can substitute into Equation 3:


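$$A_p = \frac{\mathrm{UT}_p}{\mathrm{UT}_p + \mathrm{DT}_p} = \frac{A_o \times \mathrm{PT}}{A_o \times \mathrm{PT} + (1 - A_o)\,\mathrm{PT} - (1 - A_o)(\mathrm{MCMT} + \mathrm{ALDT})/2};$$

dividing numerator and denominator by PT and simplifying,

$$A_p = \frac{A_o}{A_o + (1 - A_o) - [(1 - A_o)(\mathrm{MCMT} + \mathrm{ALDT})/2]/\mathrm{PT}},$$

$$A_p = \frac{A_o}{1 - [(1 - A_o) \times \mathrm{TTR}/(2\,\mathrm{PT})]}, \tag{4}$$

where PT = pulse time period (hours),

TTR = MCMT + ALDT if (MCMT + ALDT) < PT, or
TTR = PT if (MCMT + ALDT) > PT.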

Note that Equation 4 is valid only when Ao > 0.50.

Comparison with Monte Carlo simulation results shows that Equation 4 provides a good
estimate of Ap except in cases of very low Ao. When Ao is less than 50 percent, Equation 4
begins to deviate significantly from simulated results. However, such low values of Ao are
generally not acceptable and therefore not applied in real-world situations.
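
Equation 4, together with the limit on TTR, can be sketched in Python as shown below. The function name and signature are hypothetical, and the example computes Ao from Equation 1 with CMRess assumed to be 0, which the article does not state explicitly.

```python
def pulse_availability(ao, mcmt, aldt, pt):
    """Estimate Ap from Equation 4.

    ao is the steady-state operational availability (the article reports the
    estimate as reliable only for ao > 0.50); mcmt, aldt, pt are in hours.
    """
    ttr = min(mcmt + aldt, pt)  # time to restore, limited to the pulse time
    return ao / (1.0 - (1.0 - ao) * ttr / (2.0 * pt))


if __name__ == "__main__":
    # Ao from Equation 1 for MTBF = 100 h, OPR = 10/24, assuming CMRess = 0
    ao = 1.0 / (1.0 + (10 / 24) * (2 + 80) / 100)
    print(f"Ap = {pulse_availability(ao, mcmt=2, aldt=80, pt=720):.3f}")
```

For a 720-hour pulse these inputs give an Ap of about 0.756, slightly above the steady-state Ao of about 0.745. Because of the limit on TTR, any restore time at or above the pulse time produces the same estimate.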

The previous discussion relating to instances of long TTR leads to another simple
methodology for estimating Ap in cases where the average time to restore exceeds the pulse
time. If, as in the case of long TTR, we know that systems will not be returned to operation
during the pulse, we can use reliability, the probability of completing the pulse without a
failure, to estimate the Ap.

Table 1. Comparison of Equation 4 output to simulation results.

                                                  Ap (Equation 4)
                                    PT = 7,200 h      PT = 720 h        PT = 72 h
MTBF (h)  OPR     Ao (Equation 1)   Calc.    Sim.     Calc.    Sim.     Calc.    Sim.
100       10/24   0.745             0.746    0.746    0.756    0.757    0.872    0.857
100       20/24   0.594             0.595    0.595    0.608    0.607    0.773    0.752
500       10/24   0.936             0.936    0.936    0.939    0.940    0.971    0.968
500       20/24   0.880             0.880    0.880    0.886    0.886    0.944    0.941

Estimation  of  pulse  availability  for  long 
times  to  restore 

As stated previously, when the average time to restore is longer than the pulse time, a
failed system will (on average) not be returned to operation during the pulse. Thus, we can
compute the Ap using mission reliability as described below. The exact reliability
distribution used to calculate the pulse mission reliability does not matter; similar
calculations can be performed for any distribution. But, for simplicity, we will use the
exponential distribution.

We are interested in the probability of completing the pulse without a failure. This value
will be designated as pulse reliability (Rp).

The derivation is straightforward and independent of the number of systems operating during
the pulse, but it is more intuitive if we let N be the number of systems beginning the pulse.

For a group of N systems, UTp is equal to the number of systems that make it through the
pulse (N × Rp) multiplied by the pulse time PT, summed with the number of systems that
failed to make it through the pulse [N(1 − Rp)] multiplied by the uptime they accomplished
prior to failure, UTFAIL:

$$\mathrm{UT}_p = [N \times R_p \times \mathrm{PT}] + [N(1 - R_p) \times \mathrm{UT}_{FAIL}].$$

UTFAIL is easy to estimate for the exponential distribution because it assumes a constant
failure or hazard rate, which means that failures are equally likely to occur at any time
during the pulse. Given that a system has failed during the pulse, this lets us estimate the
average time to failure as simply one-half of the PT. So, UTFAIL = PT/2.
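
One way to gauge the PT/2 shortcut is to compare it with the exact conditional mean of an exponential time to failure, given that the failure falls inside the pulse: m − PT/(e^(PT/m) − 1), where m is the mean calendar time between failures. That expression is the standard mean of a truncated exponential and is not taken from the article; the function name and the inputs below are likewise hypothetical.

```python
import math


def mean_uptime_given_failure(pt, mean_ttf):
    """Exact mean time to failure, given an exponential failure occurs within pt.

    pt and mean_ttf are in calendar hours. This truncated-exponential mean is
    used here only as a check on the PT/2 approximation.
    """
    return mean_ttf - pt / math.expm1(pt / mean_ttf)


if __name__ == "__main__":
    pt = 144.0  # hypothetical pulse length, hours
    for mean_ttf in (60.0, 600.0, 6000.0):
        exact = mean_uptime_given_failure(pt, mean_ttf)
        print(f"mean time between failures {mean_ttf:6.0f} h: "
              f"exact {exact:6.2f} h vs PT/2 = {pt / 2:.2f} h")
```

The exact value is always somewhat below PT/2 and approaches it as the mean time between failures grows relative to the pulse, which is consistent with applying the shortcut only when that mean exceeds the pulse time.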

Now,

$$\mathrm{UT}_p = (N \times R_p \times \mathrm{PT}) + N(1 - R_p) \times \mathrm{PT}/2.$$

Table 2. Comparison of Equations 4 and 5 to simulation results.

Avg. TTR (h)  MTBF/OPR (h)  Ao (Equation 1)  Ap (Equation 4)  Sim. result  Ap (Equation 5)
25            12            0.324            0.345            0.344        0.500 (poor match)
25            60            0.706            0.724            0.726        0.545 (poor match)
25            120           0.828            0.840            0.841        0.651 (poor match)
25            600           0.960            0.963            0.963        0.893 (poor match)
82            12            0.128            0.170            0.163        0.500 (poor match)
82            60            0.423            0.506            0.496        0.545 (poor match)
82            120           0.594            0.672            0.658        0.651
82            600           0.880            0.911            0.909        0.893
162           12            0.069            0.129            0.083        0.500 (poor match)
162           24            0.129            0.229            0.168        0.501 (poor match)
162           48            0.229            0.372            0.314        0.525 (poor match)
162           72            0.308            0.471            0.438        0.568 (poor match)
162           96            0.372            0.542            0.543        0.612 (poor match)
162           120           0.426            0.597            0.582        0.651 (poor match)
162           180           0.526            0.690            0.685        0.725 (marginal)
162           240           0.597            0.748            0.748        0.774 (marginal)
162           360           0.690            0.816            0.819        0.835
162           480           0.748            0.856            0.869        0.870
162           600           0.787            0.881            0.887        0.893
162           1,200         0.881            0.937            0.942        0.943

Table 3.

Avg. TTR (h)  MTBF/OPR (h)  Sim. result  Ap (Equation 5)
82            600           0.907        0.893
162           600           0.892
242           600           0.887
322           600           0.893
402           600           0.891

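$$A_p = \frac{\text{Uptime}}{\text{Total time}} = \frac{(N \times R_p \times \mathrm{PT}) + N(1 - R_p) \times \mathrm{PT}/2}{N \times \mathrm{PT}}.$$

Divide numerator and denominator by N and PT, and we get

$$A_p = R_p + (1 - R_p)/2. \tag{5}$$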

Note  that  Equation  5  is  valid  for  exponentially 
distributed  failures;  average  time  to  restore  greater 
than  pulse  time;  and  mean  calendar  time  between 
failure  greater  than  pulse  time. 

If  one  assumes  that  system  failures  follow  the 
exponential  distribution,  Equation  5  can  be  used  to 
quickly  estimate  pulse  availability  for  systems  with  long 
times  to  restore  (relative  to  the  pulse  time).  Also, 
analysis  of  Monte  Carlo  simulation  results  indicates 
that  Equation  5  provides  a  poor  approximation  when 
the  mean  calendar  time  between  failure  (MTBF/OPR) 
is  less  than  PT. 
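
Equation 5 can be sketched in a few lines of Python. The article does not write out a formula for Rp; the sketch assumes Rp = exp(−PT/(MTBF/OPR)), that is, an exponential time to failure measured in calendar hours with mean MTBF/OPR. The function name and the example inputs are hypothetical.

```python
import math


def pulse_availability_long_restore(pt, mtbf, opr):
    """Estimate Ap from Equation 5 for restore times longer than the pulse.

    Assumes exponentially distributed failures with a mean calendar time
    between failures of mtbf / opr (an assumption of this sketch).
    """
    mean_calendar_tbf = mtbf / opr
    r_p = math.exp(-pt / mean_calendar_tbf)  # probability of finishing the pulse failure-free
    return r_p + (1.0 - r_p) / 2.0


if __name__ == "__main__":
    ap = pulse_availability_long_restore(pt=144, mtbf=250, opr=10 / 24)
    print(f"Ap = {ap:.3f}")
```

With a 144-hour pulse and a mean calendar time between failures of 600 hours, the estimate is about 0.893. The restore time does not appear in the function at all, which reflects the premise that a failed system stays down for the rest of the pulse.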

Comparison  of  pulse  availability 
estimations  to  simulation  results 

A  rigorous  proof  of  these  formulas  is  not  provided. 
However,  it  will  be  shown  that  they  compare  favorably 
to  results  of  Monte  Carlo  simulations  performed  by  the 
author.  The  author  is  confident  that  these  results  can 
be  replicated  by  others  willing  to  do  so. 

Table 1 shows results from the use of Equations 1 (Ao) and 4 (Ap) as compared with Monte
Carlo simulation results for various inputs. The MTBFs vary between 100 hours and 500 hours;
daily operating time varies between 10 hours per day and 20 hours per day; pulse times are
72 hours, 720 hours, and 7,200 hours. For all cases, ALDT = 80 hours and MCMT = 2 hours.
Each simulation result shown represents the average from a group of 10 systems operating
for the specified pulse, repeated 1,000 times.
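
The article does not describe the internals of its simulation, so the following Python sketch is only one plausible way to set up such an experiment, not a reconstruction of the author's code. It assumes that every system starts the pulse operational, that calendar time to failure is exponential with mean MTBF/OPR, and that each restore takes exactly TTR hours; all names are hypothetical.

```python
import random


def simulate_pulse_availability(mean_ttf, ttr, pt, n_systems=10, n_reps=1000, seed=0):
    """Monte Carlo estimate of Ap under the assumptions stated above.

    mean_ttf is the mean calendar time between failures (MTBF/OPR), ttr the
    fixed restore time, and pt the pulse length, all in hours.
    """
    rng = random.Random(seed)
    n_pulses = n_systems * n_reps
    total_uptime = 0.0
    for _ in range(n_pulses):
        t = 0.0  # calendar time elapsed within this pulse
        while t < pt:
            ttf = rng.expovariate(1.0 / mean_ttf)  # time until the next failure
            total_uptime += min(ttf, pt - t)       # count only uptime inside the pulse
            t += ttf + ttr                         # fail, then restore
    return total_uptime / (n_pulses * pt)


if __name__ == "__main__":
    # MTBF = 100 h at OPR = 10/24 gives a mean calendar time between failures of 240 h
    ap_sim = simulate_pulse_availability(mean_ttf=240.0, ttr=82.0, pt=720.0)
    print(f"Simulated Ap = {ap_sim:.3f}")
```

Under these assumptions the simulated value for a 720-hour pulse should land close to the Equation 4 estimate of about 0.756, with small run-to-run variation depending on the seed.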

As can be seen, the Ap simulation results are within a few percentage points of the
calculated values. Note that the simulated and calculated results for the higher MTBF agree
more closely with each other. And, as expected, when the pulse time increases, the
calculated Ap (Equation 4) approaches the steady-state Ao (Equation 1) and very closely
matches the simulation results.

Table  2  compares  the  results  of  Equations  1,  4,  and  5 
with  simulation  results  for  a  variety  of  cases. 

It can be seen that Equation 5 does not match the simulated Ap for cases where either the
TTR is less than the pulse time or the MTBF/OPR is less than the pulse time. Table 3 shows
simulation results for increasing TTR, showing as expected no effect on Ap when the TTR is
increased well above the PT.

Conclusion 

The equations and methodologies in this article describe an original approach to estimation
of pulse availability. The outputs of Equations 4 and 5 closely match results of Monte Carlo
simulations written specifically to measure Ap over typical operating cycles. The only
limitations are that Equation 4 is valid only when the operational availability is above
0.50 (hardly an actual limitation), and that Equation 5 is valid only when the failure
distribution is exponential and the reliability (MTBF) and average time to restore are both
high relative to the pulse time. Although various techniques are available to predict Ao,
very little can be found on estimation of Ap, and the author hopes to fill that void with a
simple and straightforward approach.